IBIS Macromodel Task Group

Meeting date: 11 September 2018

Members (asterisk for those attending):
ANSYS:                        Dan Dvorscak
                            * Curtis Clark
Cadence Design Systems:     * Ambrish Varma
                              Brad Brim
                              Kumar Keshavan
                              Ken Willis
eASIC:                        David Banas
GlobalFoundries:              Steve Parker
IBM:                          Luis Armenta
                              Trevor Timpane
Intel:                        Michael Mirmak
Keysight Technologies:      * Fangyi Rao
                              Radek Biernacki
                              Ming Yan
                              Stephen Slater
Mentor, A Siemens Business:   John Angulo
                            * Arpad Muranyi
Micron Technology:          * Randy Wolff
                            * Justin Butterfield
SiSoft:                     * Walter Katz
                            * Mike LaBonte
SPISim:                     * Wei-hsing Huang
Synopsys:                     Rita Horner
                              Kevin Li
Teraspeed Consulting Group:   Scott McMorrow
Teraspeed Labs:             * Bob Ross

The meeting was led by Arpad Muranyi.  Curtis Clark took the minutes.

--------------------------------------------------------------------------------
Opens:

- None.

-------------
Review of ARs:

- Randy to investigate if/why/how a clock waveform input might be used.
  - In progress.
- Michael M. to investigate if/why/how a clock waveform input might be used.
  - In progress.

--------------------------
Call for patent disclosure:

- None.

-------------------------
Review of Meeting Minutes:

Arpad asked for any comments or corrections to the minutes of the September 04
meeting.  Walter moved to approve the minutes.  Ambrish seconded the motion.
There were no objections.

-------------
New Discussion:

Vref and DDR5 improvements:
Arpad suggested we continue the previous week's discussion, and briefly shared
the previous week's minutes for an overview.  Arpad noted particular interest in
two of Ambrish's comments:
  1. How much are we losing by keeping things simple and using a differential
     waveform, and what would we gain by going to single-ended?
  2. Ambrish had noted their interest in writing a common clock BIRD, and noted
     that they felt this was the most important topic to address.

Walter noted that he had sent an email attempting to summarize the state of the
various DDR5 issues and asked to review it.

(Text of email "Summary of DDR5 issues" sent to the ATM group)

1. Asymmetric rise and fall times of a single ended channel.
   a. Both Cadence and SiSoft believe that this can be done by the EDA tool
      without any changes to the standard.
   b. Keysight believes that the standard is incomplete because:
      i.   It does not define how to generate the Impulse Response input to
           AMI_Init
      ii.  It does not define how to generate the waveform input to the Rx
           AMI_GetWave (Fangyi noted this is their primary issue)
      iii. Or the AMI methodology is invalid for single ended DDR5 DQ channels.
2. Adding DC Offset or replace the Impulse Response Input to AMI_Init with a
   Step Response
   a. Both are equivalent since a step response can be derived from an Impulse
      Response and DC Offset and vice-versa.
   b. In any case, one of these needs to be done.
   c. Impact of Tx equalization on the DC offset. (item requested by Fangyi)
3. VrefDQ
   a. The physical memory DDR5 buffer has a register that must be set by the
      controller to define the VrefDQ in the chip.
   b. This will be very close to the DC Offset defined above, but not
      necessarily so.
   c. Need to define how an EDA tool handles the impairment caused by the VrefDQ
      register resolution, and because a single VrefDQ register may control
      several DQ channels with slightly different DC Offsets.
4. Clock Ticks
   a. The DQS to DQ skew in the DDR5 memory receiver is defined by the
      Controller.  This skew is determined by simulation, or by a hardware
      training algorithm.
   b. One way to handle this is to put a CDR in the memory DQ Rx and assume that
      this CDR will find, use and report the optimal DQS/DQ phase.
   c. A possible useful reserved parameter is the DQS/DQ interconnect skew.
   d. Another way is to have the Controller Tx AMI Model generate clock ticks
      that the Memory Rx AMI Model reads and uses.  A BIRD 147 protocol can be
      defined between the Tx and Rx to optimize this skew (and the Rx DFE taps
      as well).
5. Component Based AMI Simulations
   a.
      Both Cadence and SiSoft believe that this should be dealt with by the EDA
      tool.  It knows the DQS/DQ interconnect skew for each DQ in a “Component”,
      and therefore can determine the required skew training parameters or the
      impairment added to the nibble.  Note that a component in this context can
      be a single memory chip or multiple memory chips in a module.  Similarly,
      the EDA tool knows the Vcent for each DQ channel and can calculate the
      ideal VrefDQ for the module and the impairment.  There is little or no
      difference between DDR4 and DDR5 in this regard.
   b. Keysight believes that IBIS AMI needs to be enhanced (or a new
      methodology) to deal with Component Level AMI Simulations for DDR5.
6. Power Aware Simulations (Arpad raised the subject.  Walter noted he had left
   it off the list originally because it received little attention in the straw
   poll).

In-depth discussion of items:

1. Asymmetric rise fall - Walter asked if the summary accurately captured
   people's statements.  Fangyi agreed, and noted that 1.ii. was the primary
   issue.
2. DC Offset - Fangyi asked if this item contradicted the assertion in 1.a.
   that no change to the standard was needed.  Walter noted that needing one
   more reserved parameter to pass in the offset was not at the level of
   "changing the standard."  Fangyi asked for item 2.c. to be added.
3. VrefDQ - Walter recapped and noted that VrefDQ is not the same as DC Offset.
   DC offset is the midpoint between the ends of the step response, which is a
   simulation result, where VrefDQ has to do with a register value set in
   memory.  There are issues of VrefDQ resolution and how that differs from DC
   Offset.  All of this is independent of Vcent, which is an independent issue
   related to an eye measurement of all the bits in a nibble.  No one disagreed
   that these were issues to be dealt with.
4. Clock Ticks - Walter noted that DQS to DQ skew in DDR memory is defined by
   the controller.  The skew setting is determined by simulation or by a
   hardware training algorithm.
   So, for writes, where the memory is the Rx, the clock is fed to the memory
   and the skew between clock and data is set by the controller.  One way to
   handle this in simulation is to put a CDR in the memory (DQ Rx model) and
   assume that the CDR will find, use, and report the optimal DQS to DQ phase
   at the memory.  A Reserved parameter to define the DQS to DQ skew is one
   possible solution.  Another way would be for the controller Tx to generate
   clock ticks that the memory Rx uses.

   Ambrish noted that in their solution they don't use a CDR in their Rx model,
   they use the strobe signal to generate the clock information.  Walter asked
   how the phase between the DQ and DQS was defined.  Ambrish said that if the
   DQ and DQS waveforms were generated concurrently, then there was no need for
   the phase to be calculated at the Rx.

   Fangyi noted that there were two related issues at play.  The controller
   adjusts the phase difference between DQ and DQS.  That is a single value
   that is determined during training and persists until the next training.
   Randy noted that this skew was per DQ, i.e., each DQ has its own skew.
   Fangyi agreed.  Fangyi noted that the second issue is that the clock
   transition used by the DRAM DFE suffers from jitter in the DQS signal.  So,
   there are two issues.  Fangyi said once the single fixed skew is determined
   (training mode), this could be passed to the Rx, but the Rx will still need
   to recover clock ticks from the DQS signal.

   Walter noted that the standard doesn't currently allow for the DQS signal to
   be passed into the model.  Ambrish said the EDA tool could determine the
   clock ticks from the DQS signal and pass these into the Rx model in the same
   memory (clock_times argument to GetWave()) currently used by the Rx model to
   return clock ticks.  Walter noted that this too is not currently allowed in
   the standard.

Walter noted that the point of this exercise is to simply agree on the issues,
not necessarily the solutions.
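
As an illustration of the tool-side approach Ambrish described, the following is
a minimal sketch of how clock ticks might be derived from a sampled DQS
waveform: it finds linearly interpolated threshold crossings and shifts them by
a fixed, trained DQS to DQ skew.  The function name dqs_clock_ticks, the
threshold and skew arguments, and the sample values are assumptions made for
illustration; none of them is defined by IBIS, and passing such ticks into the
Rx model is not currently allowed by the standard.

```python
from typing import List, Sequence


def dqs_clock_ticks(times: Sequence[float], dqs: Sequence[float],
                    threshold: float, skew: float = 0.0) -> List[float]:
    """Return threshold-crossing times of a sampled DQS waveform, plus skew.

    Hypothetical helper for illustration; not an IBIS-defined function.
    """
    if len(times) != len(dqs):
        raise ValueError("times and dqs must have the same length")
    ticks: List[float] = []
    for i in range(1, len(dqs)):
        v0 = dqs[i - 1] - threshold
        v1 = dqs[i] - threshold
        # A crossing occurs when consecutive samples lie on opposite sides
        # of the threshold (rising or falling edge).
        if (v0 < 0.0) != (v1 < 0.0):
            # Linear interpolation between the two samples.
            frac = v0 / (v0 - v1)
            t_cross = times[i - 1] + frac * (times[i] - times[i - 1])
            # Apply the fixed skew assumed to come from training.
            ticks.append(t_cross + skew)
    return ticks


# Toy DQS samples (assumed values): a 0 V to 1 V toggle, one sample per time unit.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
dqs = [0.0, 1.0, 0.0, 1.0, 0.0]
ticks = dqs_clock_ticks(times, dqs, threshold=0.5, skew=0.25)
```

Because the crossings are taken from the DQS samples themselves, any jitter on
the DQS edges carries through to the ticks, while the skew argument stands in
for the single per-DQ value determined during training.  The sketch ignores
noise near the threshold and per-edge (rise versus fall) thresholds.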

Review of Walter's views on the BIRDs required to accomplish everything.

1. Define a new parameter DC_offset that represents the mid-point of the
   start-to-finish range of the step response.
2. Cadence and SiSoft don't think a BIRD is required to address asymmetric rise
   and fall rates.  Walter noted that they may need to convince users and DDR5
   model makers that the solutions they've implemented are sufficiently
   accurate, but there is no need to modify the spec.

   Fangyi again objected to this.  He said we can't just say the tool can do
   whatever it wants.  Walter noted that an EDA vendor could demonstrate that
   their method gets results that are very close to a full SPICE simulation by
   doing some fancy convolutions.  They could document that method and its
   accuracy, and then we have a solution.  Fangyi asked why we need a standard
   at all in that case.

   Ambrish and Walter said the standard tells you that you need a waveform into
   Rx GetWave().  It doesn't tell you how to generate that waveform.  The flow
   that is given in the standard happens to be valid for differential signaling
   and has issues with single-ended.  We could write a BIRD to define the way
   to do it for DDR5, or we can say Cadence, SiSoft, etc., each decide to do it
   their own way.

   Fangyi said this was analogous to writing a bsim model only to find out that
   one tool doesn't apply the same fundamental physics to the bsim model that
   others do.  In that case a bsim model would be useless.  Walter said this
   could be a future discussion point.
3. Walter noted that he doesn't think we need a VrefDQ parameter.  It becomes a
   voltage impairment that can be rolled into Rx_Receiver_Sensitivity.  Since
   the model is told what the DC Offset is, it can determine its VrefDQ
   granularity impairment.  We may decide to define a Reserved parameter for
   VrefDQ, but it's not strictly necessary.
4. DQS to DQ skew - Walter noted that the controller generates the DQ and DQS
   signals so the controller model could generate the clock ticks.
   Ambrish disagreed and said the tool generates the DQS waveform, passes it
   through the DQS channel, captures it at the Rx and gives clock ticks to the
   DQ Rx.  Walter asked what phase is used when the EDA tool gives the DQS
   waveform to the DQ Rx.  Fangyi said the phase that resulted from the
   training would be built into the DQS waveform.  Ambrish noted 90 degrees out
   of phase with the DQ, for example.

   Walter asked how the clock tick array is passed into the Rx model by the EDA
   tool.  Ambrish noted the clock_times array would be used as an input.
   Walter said a BIRD would be needed for that approach.  Ambrish agreed and
   said it's one of the two things his group feels is necessary to address.

   Fangyi noted another solution was to have the tool simply generate the DQS
   waveform, and then the DQ Rx model gets both the data DQ signal waveform and
   the clock DQS signal waveform.  Then let the Rx model recover the clock
   ticks.  Fangyi noted that you might have 8 DQs and the one DQS.

   Randy noted that as a model maker he wouldn't write a model with the
   complexity to have the entire DQ and data clock chain with all 8 DQs tied
   together.  That was more of a component level of modeling.  Randy also noted
   that if the clock ticks were to be passed into the DQ model as Ambrish
   proposed, then we really have to understand the phase difference that Walter
   had mentioned.  Randy noted that the models usually assume the training has
   put the strobe in the best location for the DQ sampling.  If the EDA tool
   were to simply pass in a strobe with an ideal 90-degree offset, then that
   might not provide the right answers.  He said we needed to consider the
   difference between the EDA tool passing in clock ticks, perhaps with some
   jitter applied to get the right characteristics, or the DQ model employing
   some type of fake CDR algorithm to identify the best timing location and
   then utilizing existing Reserved Parameters for AMI models with CDRs to
   define the strobe jitter.

   Fangyi asked if the Rx model should just take in the full DQS waveform as an
   input and figure out where to clock the DQ, rather than relying on the tool
   to generate clock ticks.  Randy said that requirement would put a lot of
   extra burden on the model maker.  Fangyi said that if you leave it up to the
   tool then you don't know what the tool is going to do.  If you're going to
   consider putting a CDR in your Rx model, why not just pass the DQS waveform
   into the model instead?

   Arpad said there were two independent questions.  Whether you send only the
   DQ (fake CDR) or the DQS signal into the model, or the EDA tool determines
   the clock ticks, the correct phase still has to be determined.

- Walter: Motion to adjourn.
- Randy: Second.
- Arpad: Thank you all for joining.

-------------
Next meeting: 18 September 2018 12:00pm PT
-------------

IBIS Interconnect SPICE Wish List:

1) Simulator directives